@yiyuxuxu Hello, could you please review this PR? Thank you.

@yiyixuxu @sayakpaul Hello, could you please review this PR? Thank you.
    return x

class RopeEmbedderNPU:
Is it not possible to modify the existing RopeEmbedder class to account for the NPU hardware?
Maybe we could modify the current RopeEmbedder class to match RopeEmbedderNPU? It could then work for both devices.
No, I mean we keep the class name as RopeEmbedder and modify its definition so that it also works on NPU alongside the currently supported devices.
@sayakpaul I merged RopeEmbedderNPU into the RopeEmbedder class. Please review the modification.

@bot /style

Style bot fixed some files and pushed the changes.

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sayakpaul Can you merge this PR? I have verified both the GPU and NPU cases, and both work. Thanks.
What does this PR do?
Issue: The Diffusers Z-Image pipeline fails on Ascend NPU because aclnnIndex lacks support for complex64. When freqs_cis is stored as a complex tensor on the NPU and then indexed, the call crashes.
Fix: Rework the RoPE frequency handling so that the NPU never needs to index complex64 tensors.
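For reviewers less familiar with RoPE, here is a minimal sketch of one common way to avoid complex tensors: keep the frequencies as real-valued cos/sin tables, index those, and apply the rotation with real arithmetic. This is not the code in this PR. It uses plain Python instead of torch so it runs anywhere, and the names `build_rope_tables`, `apply_rope_real`, and `apply_rope_complex` are hypothetical, as is the `theta=10000.0` default.

```python
import cmath
import math


def build_rope_tables(dim, max_pos, theta=10000.0):
    """Precompute real-valued cos/sin tables, each of shape [max_pos][dim // 2]."""
    half = dim // 2
    inv_freq = [theta ** (-2.0 * i / dim) for i in range(half)]
    cos_table = [[math.cos(p * f) for f in inv_freq] for p in range(max_pos)]
    sin_table = [[math.sin(p * f) for f in inv_freq] for p in range(max_pos)]
    return cos_table, sin_table


def apply_rope_real(x, pos, cos_table, sin_table):
    """Rotate consecutive pairs (x[2i], x[2i+1]) using only real arithmetic."""
    # The position lookup touches real-valued tables only, never complex data.
    cos_row, sin_row = cos_table[pos], sin_table[pos]
    out = []
    for i, (c, s) in enumerate(zip(cos_row, sin_row)):
        a, b = x[2 * i], x[2 * i + 1]
        # (a + bi) * (c + si) = (a*c - b*s) + (a*s + b*c)i
        out.extend([a * c - b * s, a * s + b * c])
    return out


def apply_rope_complex(x, pos, dim, theta=10000.0):
    """Reference: the same rotation written as a complex multiplication."""
    out = []
    for i in range(dim // 2):
        angle = pos * theta ** (-2.0 * i / dim)
        z = complex(x[2 * i], x[2 * i + 1]) * cmath.exp(1j * angle)
        out.extend([z.real, z.imag])
    return out


if __name__ == "__main__":
    dim, max_pos = 8, 16
    cos_table, sin_table = build_rope_tables(dim, max_pos)
    x = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, 3.0, -2.0]
    real = apply_rope_real(x, 5, cos_table, sin_table)
    ref = apply_rope_complex(x, 5, dim)
    # Largest absolute difference between the two formulations.
    print(max(abs(a - b) for a, b in zip(real, ref)))
```

The two functions agree up to floating-point rounding, so the real-valued form can stand in for the complex one wherever the backend cannot index complex64 data.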
Running Environment:
I run this model with cache-dit (https://github.com/vipshop/cache-dit/), which depends on diffusers for inference.
Command:
python3 generate.py zimage --model-path /home/weights/Z-Image-Turbo --attn _native_npu
Error before the fix:

Result after the fix:

Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.